class: inverse, left, bottom, hide_logo
<!--background-image: url("images/U.S.-Air-Force.jpg")-->
<!-- background-image: url("images/rso.png") -->
<!-- background-position: top, left -->
<!-- background-size: 50 -->

# Introduction to Weibull Analysis

## 01 Mar 2022

---

## Introduction

Weibull analysis is a collection of graphical and statistical techniques used to estimate important life characteristics of a product by determining which member of the Weibull distribution family best fits the data

Weibull analysis is but one of many <a target="_blank" href="https://data.princeton.edu/wws509/notes/c7.pdf">survival analysis</a> techniques used to describe a product's underlying failure process based on the observed time-to-event data <span class="explain"></span><span class="tooltip">aka: failure time data<br/>life data</span>

This presentation introduces several aspects of Weibull analysis and how they are applied to various types of reliability data

- The Weibull distribution
    + Background, properties, and importance
    + Distribution functions
    + Parameters: shape `\(\beta\)`, scale `\(\eta\)`, location `\(\theta\)`
    + Relationship to other distributions
- Weibull probability plots
    + Constructing the plot
    + Plotting the observed events
    + Fitting the Weibull model and estimating parameters

`$$\newcommand\redbf[1]{\color{red}{\boldsymbol{#1}}}$$`
`$$\newcommand\mbf[1]{{\boldsymbol{#1}}}$$`
`$$\newcommand\greenbf[1]{\color{green}{\boldsymbol{#1}}}$$`
`$$\newcommand\bluebf[1]{\color{blue}{\boldsymbol{#1}}}$$`
`$$\newcommand\purplebf[1]{\color{purple}{\boldsymbol{#1}}}$$`

---

## About this presentation

.panelset[.panel[.panel-name[Navigating this presentation]

.pull-left[

Note the navbar above
+ Several slides in this presentation include panels that contain more information about a topic
+ Depending on your browser, the left and right arrow keys may be used to navigate the panels
+ Otherwise, just click the panel title to view the content on that panel

This presentation also includes interactive elements

+ This <span class="explain"></span><span class="tooltip">When you hover over these tooltips a pop-up window appears showing additional information</span> is a text tooltip - hover your mouse over this symbol
+ Many graphs are also interactive, and display additional data when you hover over them - like the plot.ly graph to the right <span class="explain"></span><span class="tooltip">You should see a three-dimensional surface plot - If you hover your mouse over the surface, a tooltip shows information about the surface at the coordinates where your mouse is pointing. You can also rotate the graph in any direction or zoom in/out.<br/><br/>These features rely on your browser being able to properly render WebGL content. 
If you either don't see a graph, or if the graph responds very slowly, this is likely due to the WebGL settings in your browser - visit https://get.webgl.org for more info</span>

For more navigation options press "h" at any time

]

.pull-right[

<iframe src="images/p.html" width="100%" height="400" id="igraph" scrolling="no" seamless="seamless" frameBorder="0"></iframe>

]

.panel[.panel-name[Details about this presentation]

This is an HTML presentation created using the <a target=" " href="https://remarkjs.com/#1">Remarkjs</a> slideshow framework along with the following resources

- <a target=" " href="https://rmarkdown.rstudio.com/">RMarkdown</a>
- <a target=" " href="https://www.mathjax.org/">MathJax</a>
- <a target=" " href="https://plotly.com/">Plotly</a>
- <a target=" " href="https://imagemagick.org/index.php">ImageMagick</a>
- <a target=" " href="https://fontawesome.com/">Font Awesome</a>

The open-source code used to create these slides can be accessed from <a target=" " href="https://github.com/Auburngrads/weibull_analysis">this GitHub repo</a>

Upon accessing this presentation, several server-side resources may take a few seconds to fully load - if things don't render properly or if you find an error, please <a target=" " href="https://github.com/Auburngrads/weibull_analysis/issues/new/choose">create an issue</a>

]
]
]

---
class: inverse, left, bottom, hide_logo

# The Weibull Distribution

---

## Background of the Weibull Distribution

.pull-left[

The distribution is named after Ernst Hjalmar Waloddi Weibull (1887–1979), the Swedish engineer, scientist, and mathematician <span class="explain"></span><span class="tooltip">As a Swede, his surname should be pronounced as "Vay-bull" not "Why-bull"</span>

The Weibull distribution was actually discovered by the French mathematician Maurice Rene Fréchet in the course of deriving the Fréchet distribution (now known as the inverse Weibull distribution)

In his landmark paper, Weibull (1951) popularized the use of 
his namesake distribution

+ Weibull had hoped to publish in a prominent mathematics journal, but had to "settle" for an applied engineering journal
+ Initial reaction to the paper was sharply negative
+ In the 1970s, the U.S. Air Force and automotive industry began implementing the distribution and the methods described in the paper
+ Today, Weibull analysis is one of the most commonly used methods for evaluating life data

]

.pull-right[

<div class="row">
<div class="column">
<img src="https://upload.wikimedia.org/wikipedia/commons/0/09/Waloddi_Weibull_SPA.jpg" alt="Waloddi Weibull" style="width:100%">
</div>
<div class="column">
<img src="https://upload.wikimedia.org/wikipedia/commons/a/a5/Frechet.jpeg" alt="Maurice Rene Fréchet" style="width:100%">
</div>
</div>

]

---

## Properties of the Weibull Distribution

The Weibull distribution is a member of a distribution family called "lifetime distributions"

- These distributions describe continuous random variables (R.V.) defined over strictly positive values - `\(T \in \mathbb{R}^{+}\)`
- There are many members of the lifetime distribution family (exponential, lognormal, loglogistic, gamma, Birnbaum-Saunders)
- Likewise, there are many distributions that are not members of the lifetime distribution family (normal, logistic, smallest extreme value)

Lifetime distributions have successfully served as population models for failure times arising from a wide range of products and failure mechanisms

- In some cases there are probabilistic arguments based on the physics of failure that justify the choice of a model
- More often, however, a model is chosen solely because of its demonstrated success in fitting failure data
- <u>**This is why the Weibull distribution is so popular**</u> - it's flexible, and is capable of fitting many different failure patterns

The Weibull is also a member of the extreme value distribution family

- These models describe the time to failure of the weakest (or strongest) link in a "chain" 
of components - The Weibull is directly related to the smallest extreme value distribution - if `\(X \sim \text{WEIB}(\eta, \beta)\)` then `\(\log[X] \sim \text{SEV}(\mu = \log(\eta), \sigma = 1/\beta)\)` --- ## Weibull Distribution Functions - PDF & CDF .panelset[ .panel[.panel-name[Probability Density Function - PDF] .pull-left[
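A minimal Python sketch of the Weibull density shown in this panel, using only the standard library. The helper name `weibull_pdf` and the values `beta = 2`, `eta = 100` are illustrative assumptions, not part of the original analysis. The midpoint-rule sum is a rough check that the area under the density is close to 1.

```python
import math

def weibull_pdf(t, beta, eta):
    """Weibull density f(t) for shape beta > 0 and scale eta > 0 (hypothetical helper)."""
    if t <= 0:
        return 0.0  # the density is taken as 0 outside t > 0
    z = t / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))

# Illustrative parameter values (assumed for this example only)
beta, eta = 2.0, 100.0

# Midpoint-rule approximation of the area under f(t) on (0, 1000]
dt = 0.01
area = sum(weibull_pdf((k + 0.5) * dt, beta, eta) * dt for k in range(100_000))
print(f"area under f(t): {area:.6f}")
```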
]

.pull-right[

The probability density function (aka density function, or just density) describes the relative likelihood that the R.V. <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">T</span> takes a value near <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">t</span>. Because <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">T</span> is continuous, the probability of any single value is zero, so the density is defined as the derivative of the CDF

$$ f(t) = \frac{d}{dt}F(t) = \lim_{\Delta t \to 0}\frac{\Pr(t < T \le t+\Delta t)}{\Delta t} $$

The PDF for the Weibull distribution is expressed as

$$ f(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right] $$

] <!--end .pull-right -->
]

.panel[.panel-name[Cumulative Distribution Function - CDF]

.pull-left[
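A short Python sketch comparing the closed-form Weibull CDF in this panel with a numerical integral of the density over `[0, t]`. The helper names and the values `beta = 1.5`, `eta = 200`, `t = 150` are assumptions chosen only for illustration.

```python
import math

def weibull_pdf(t, beta, eta):
    """Weibull density f(t); returns 0 for t <= 0 (hypothetical helper)."""
    if t <= 0:
        return 0.0
    z = t / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))

def weibull_cdf(t, beta, eta):
    """Closed-form Weibull CDF F(t) = 1 - exp[-(t/eta)^beta] (hypothetical helper)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / eta) ** beta))

# Illustrative values (assumed for this example only)
beta, eta, t = 1.5, 200.0, 150.0

# Midpoint-rule approximation of the area under f(u) for u in [0, t]
n = 20_000
du = t / n
area = sum(weibull_pdf((k + 0.5) * du, beta, eta) * du for k in range(n))
exact = weibull_cdf(t, beta, eta)
print(f"closed form: {exact:.6f}   numerical area: {area:.6f}")
```

The two printed values should agree to several decimal places, which is the "area under the curve" reading of `F(t)`.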
] .pull-right[ The CDF is formally defined as the cumulative probability that the R.V. <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">T</span> is equal to or less than <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">t</span> $$ F(t) = \Pr(T \le t) $$ Less formally, <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">F(t)</span> is equal to the area under the curve of the density function in the interval <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">[0,t]</span>, i.e. $$ F(t) = \int_{0}^{t}f(u)du $$ The Weibull CDF is expressed as $$ F(t) = 1-\exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right] $$ ] <!--end .pull-right --> ] <!--end .panel --> .panel[.panel-name[CDF/PDF Relationship] <img src="https://lh3.googleusercontent.com/aq3Iw1r5gb3Ixys0Kpk0KEJ29s2rqvE6T9gQp4J58BC8j_ocow6YI83yJ_NneXgEIv1lBTO1VTDlVIxC57FPa73n6rRryOexaEfKOaX4R38onqB9tsOHBIrlq7KEZCtyiyAaJERVDfM=w2400"> ] ] <!--end .panelset --> --- ## Weibull Distribution Functions - Survival .panelset[.panel[.panel-name[Survival Function] .pull-left[
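A small Python sketch of the Weibull survival function from this panel, tabulated next to its complement `F(t) = 1 - S(t)`. The helper names and the values `beta = 2`, `eta = 1000` are hypothetical and used only for illustration.

```python
import math

def weibull_survival(t, beta, eta):
    """S(t) = exp[-(t/eta)^beta] for t > 0, and 1 otherwise (hypothetical helper)."""
    if t <= 0:
        return 1.0
    return math.exp(-((t / eta) ** beta))

def weibull_cdf(t, beta, eta):
    """F(t) obtained as the complement of the survival function."""
    return 1.0 - weibull_survival(t, beta, eta)

# Illustrative values (assumed for this example only)
beta, eta = 2.0, 1000.0
times = (100.0, 500.0, 1000.0)
table = [(t, weibull_survival(t, beta, eta), weibull_cdf(t, beta, eta)) for t in times]
for t, s, f in table:
    print(f"t = {t:6.0f}   S(t) = {s:.4f}   F(t) = {f:.4f}")
```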
] .pull-right[ The survival function (aka reliability function, or complementary CDF) is formally defined as the probability that the R.V. <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">T</span> is greater than <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">t</span> $$ S(t) = \Pr(T > t) = 1-F(t) $$ Less formally, the survival function is equal to the area under the curve of the density function in the interval <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">(t,∞)</span>, i.e. $$ S(t) = \int_{t}^{\infty}f(u)du $$ The Weibull survival function is expressed as $$ S(t) = \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right] $$ ] <!--end .pull-right --> ] <!--end .panel --> .panel[.panel-name[PDF/Survival Relationship] <img src="https://lh3.googleusercontent.com/pcMb_Q3xwSq8WdHhlD0zyUyVFWjkK_VwT5KZd2BM4FvI0-NoNgMcOAM7mMXupivB5pVQIQ9AbbrtXdlmfS4nPKhK643hxkpozGkcUzsrCaMDoH7M0ME4QeVOOphNfQf4UyJmh8h-QRA=w2400"> ] <!--end .panel --> ] <!--end .panelset --> --- ## Weibull Distribution Functions - Quantile .panelset[.panel[.panel-name[Quantile Function] .pull-left[
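A minimal Python sketch of the Weibull quantile function given in this panel, used here to compute the time by which 10% of units are expected to fail and the median life. The helper name `weibull_quantile` and the values `beta = 2`, `eta = 1000` are assumptions for illustration only.

```python
import math

def weibull_quantile(p, beta, eta):
    """t(p) = eta * (ln[1 / (1 - p)])^(1 / beta) for 0 <= p < 1 (hypothetical helper)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    # -log1p(-p) equals ln[1 / (1 - p)] and stays accurate for small p
    return eta * (-math.log1p(-p)) ** (1.0 / beta)

# Illustrative values (assumed for this example only)
beta, eta = 2.0, 1000.0
b10 = weibull_quantile(0.10, beta, eta)     # time at which 10% of units have failed
median = weibull_quantile(0.50, beta, eta)  # time at which 50% of units have failed
print(f"10% quantile: {b10:.1f}   median: {median:.1f}")
```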
]

.pull-right[

The quantile function (aka the percent point function) is formally defined as the smallest realization of the R.V. <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">T</span> whose cumulative probability is at least <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">p</span>

$$ t(p)=\inf \Big\\{t\in\mathbb{R} :p\leq F(t)\Big\\} $$

For a continuous, strictly increasing CDF such as the Weibull, the quantile function is the inverse of the CDF

$$ t(p) = F^{-1}(p) $$

The Weibull quantile function is expressed as

$$ t(p) = \eta\left(\ln\left[\frac{1}{1 - p}\right]\right)^{1/\beta} $$

]
]

.panel[.panel-name[Quantile CDF Relationship]
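A rough numerical check of the quantile/CDF relationship named in this panel: applying the Weibull CDF to `t(p)` should return `p`. The helper names and the values `beta = 0.8`, `eta = 50` are hypothetical choices for illustration.

```python
import math

def weibull_cdf(t, beta, eta):
    """F(t) = 1 - exp[-(t/eta)^beta] for t > 0, and 0 otherwise (hypothetical helper)."""
    if t <= 0:
        return 0.0
    return -math.expm1(-((t / eta) ** beta))

def weibull_quantile(p, beta, eta):
    """t(p) = eta * (ln[1 / (1 - p)])^(1 / beta) for 0 <= p < 1 (hypothetical helper)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return eta * (-math.log1p(-p)) ** (1.0 / beta)

# Illustrative values (assumed for this example only)
beta, eta = 0.8, 50.0
probs = (0.01, 0.25, 0.632, 0.99)
round_trip = [weibull_cdf(weibull_quantile(p, beta, eta), beta, eta) for p in probs]
for p, back in zip(probs, round_trip):
    print(f"p = {p:.3f} -> t(p) = {weibull_quantile(p, beta, eta):9.3f} -> F(t(p)) = {back:.3f}")
```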
] .panel[.panel-name[Weibull Scale Parameter] .pull-left[ The quantile function is closely related to warranty periods as it is used to estimate the time at which `\(X\%\)` of units are expected to fail The Weibull scale parameter `\(\eta\)` is sometimes called the "characteristic life" parameter as it equates to a specific quantile value The scale parameter has the same units as `\(t\)` Setting `\(t = \eta\)` in the Weibull CDF shows that `\(F(t=\eta|\eta,\beta) \approx 0.632\)` This result does not depend on the value of `\(\eta\)` or `\(\beta\)` ] .pull-right[ $$ `\begin{aligned} F(t = \eta|\eta,\beta) &= 1 - \exp\bigg[-\bigg(\frac{\eta}{\eta}\bigg)^{\beta}\bigg]\\\\ &= 1 - \exp\bigg[-\bigg(1\bigg)^{\beta}\bigg]\\\\ &=1-\exp\bigg[-1\bigg]\\\\ &=1-0.368\\\\ &=0.632 \end{aligned}` $$ ] ] ] --- ## Weibull Distribution Functions - Hazard .panelset[.panel[.panel-name[Hazard Function] .pull-left[
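A short Python sketch of the Weibull hazard function, checking the closed form `(beta/eta)(t/eta)^(beta-1)` against the ratio `f(t) / S(t)`. The helper names, the scale `eta = 100`, and the three shape values are assumptions for illustration only.

```python
import math

def weibull_pdf(t, beta, eta):
    """Weibull density f(t) for t > 0 (hypothetical helper)."""
    z = t / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))

def weibull_survival(t, beta, eta):
    """Weibull survival function S(t) for t > 0 (hypothetical helper)."""
    return math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    """Closed-form hazard h(t) = (beta/eta) * (t/eta)^(beta-1) for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)

# Illustrative values (assumed for this example only)
eta = 100.0
times = (10.0, 50.0, 200.0)
hazards = {}
for beta in (0.5, 1.0, 2.0):
    hazards[beta] = [weibull_hazard(t, beta, eta) for t in times]
    for t, h in zip(times, hazards[beta]):
        # The closed form should match the ratio of density to survival
        ratio = weibull_pdf(t, beta, eta) / weibull_survival(t, beta, eta)
        assert math.isclose(h, ratio, rel_tol=1e-9)
    print(beta, [round(h, 5) for h in hazards[beta]])
```

With these inputs the hazard falls with `t` for `beta = 0.5`, stays constant for `beta = 1`, and rises for `beta = 2`.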
]

.pull-right[

The hazard function <span class="explain"></span><span class="tooltip"> aka the hazard rate function or failure rate function</span> is formally defined as the limiting conditional probability, per unit time, that the R.V. <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">T</span> falls in the interval <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">(t, t + Δt]</span> given that it has survived to time <span class="mjx-char MJXc-TeX-math-I" style="display:inline;">t</span>

$$ h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t < T \le t+\Delta t\vert T \ge t)}{\Delta t} $$

The hazard function is equal to the ratio of the PDF and the survival function

$$ h(t) = f(t) / S(t) $$

Looking at the Weibull PDF, we see both the <font color="red">hazard</font> function and the <font color="blue">survival</font> function

$$ f(t) = \redbf{\frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}}\bluebf{\exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]} $$

]
]

.panel[.panel-name[PDF/Survival/Hazard Relationship]

<img src="https://lh3.googleusercontent.com/Zs7zEpXJ664xTQiRTrHzGpUNPPBXQ9HO9y_PAgvwCcuPbGKOEoESBx9l417efoxP1sfXohX-5lPD5PGVudyGmLkbfZl0wddGHRESU5buM5ankgZBZAwn3055eIpJdMN_6kHc-BiemYM=w2400">

]
]

---

## Weibull Distribution Properties

.panelset[.panel[.panel-name[Shape Parameter β]

The shape parameter determines the variance of the Weibull distribution and impacts the shape of all of the functions

When `\(0 < \beta < 1\)`

- The density is highest near `\(0\)` and decreases steeply before leveling out
- The hazard rate is a decreasing function

When `\(\beta = 1\)`

- The distribution is equivalent to the exponential distribution
- The hazard rate is constant as the distribution shares the memory-less property

When `\(\beta > 1\)`

- The hazard rate transitions to an increasing function
- In the literature, authors often state that `\(\beta > 1\)` implies that the observed failures represent a "wear out" or end of life type of failure
- In reality, this just implies that the distribution has a lower variance

]

.panel[.panel-name[Impact of β on Weibull functions]

<img 
src="https://lh3.googleusercontent.com/-14JBhtF5kh49PxT6QiwgQCcwhc07tBXpjxKxTQzwywhJQigkeQ53WcG1pr6Vo484hJqn1J6hPayZ73qU1n9Ji-j0tikBz0AFiqINkjMKY5KJbln2WmdnG3lZhOKGWk_BOSV0qXQX4E=w1920-h1080">

]

.panel[.panel-name[Other properties]

.pull-left[

Mean: `\({\displaystyle t_{\text{mean}}=\mu =\eta \,\Gamma (1+1/\beta)}\)`

Median: `\({\displaystyle t_{\text{median}} = \eta (\ln 2)^{1/\beta}}\)`

Mode: `\({\displaystyle t_{\text{mode}}={\begin{cases}\eta \left({\frac {\beta-1}{\beta}}\right)^{1/\beta}\,,&\beta>1,\\0,&\beta\leq 1.\end{cases}}}\)`

Variance: `\({\displaystyle \sigma^{2}=\eta ^{2}\left[\Gamma \left(1+{\frac {2}{\beta}}\right)-\left(\Gamma \left(1+{\frac {1}{\beta}}\right)\right)^{2}\right]}\)`

]

.pull-right[

Skewness: `\({\displaystyle \mu_{3}={\frac {\Gamma (1+3/\beta)\eta ^{3}-3\mu \sigma ^{2}-\mu ^{3}}{\sigma ^{3}}}}\)`

Entropy: `\({\displaystyle H = \gamma (1-1/\beta)+\ln(\eta /\beta)+1}\)`

Moment Generating Function: `\({\displaystyle M = \sum _{n=0}^{\infty }{\frac {t^{n}\eta ^{n}}{n!}}\Gamma (1+n/\beta),\ \beta\geq 1}\)`

Characteristic Function: `\({\displaystyle \varphi=\sum _{n=0}^{\infty }{\frac {(it)^{n}\eta ^{n}}{n!}}\Gamma (1+n/\beta)}\)`

]

<center>Note: <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi mathvariant="normal">Γ<!-- Γ --></mi> <mo stretchy="false">(</mo> <mo>⋅<!-- ⋅ --></mo> <mo stretchy="false">)</mo> </math> refers to the <a target="_blank" href="https://mathworld.wolfram.com/GammaFunction.html">Complete Gamma Function</a>; γ in the entropy expression is the Euler-Mascheroni constant</center>

]
]

---

## Distribution Function Table

Each cell in the table shows the expression used to transform from the function at the top to the function on the left, where `\(H(t)\)` is the cumulative hazard function

| | `\(F(t)\)` | `\(f(t)\)` | `\(S(t)\)` | `\(h(t)\)` | `\(H(t)\)` |
|------|------|------|------|------|------|
| `\(F(t)\)` | | `\(\displaystyle\int_0^{t}f(u)du\)` | `\(1-S(t)\)` | `\(\displaystyle 1-\exp\left[-\int_0^{t} h(u)du\right]\)` | `\(\displaystyle 1-\exp\left[-H(t)\right]\)` |
| `\(f(t)\)` | `\(\displaystyle\frac{d}{dt}F(t)\)` | | `\(\displaystyle-\frac{d}{dt}S(t)\)` | `\(\displaystyle h(t)\cdot\exp\left[-\int_0^{t} h(u)du\right]\)` | `\(\displaystyle \frac{dH(t)/dt}{\exp[H(t)]}\)` |
| `\(S(t)\)` | `\(1-F(t)\)` | `\(\displaystyle \int_t^{\infty}f(u)du\)` | | `\(\displaystyle\exp\left[-\int_0^{t} h(u)du\right]\)` | `\(\displaystyle \exp[-H(t)]\)` |
| `\(h(t)\)` | `\(\displaystyle \frac{dF(t)/dt}{1-F(t)}\)` | `\(\displaystyle \frac{f(t)}{\int_t^{\infty}f(u)du}\)` | `\(\displaystyle -\frac{d}{dt}\ln \left[S(t)\right]\)` | | `\(\displaystyle \frac{d}{dt}H(t)\)` |
| `\(H(t)\)` | `\(\displaystyle-\ln[1-F(t)]\)` | `\(\displaystyle -\ln\left[\int_t^{\infty}f(u)du\right]\)` | `\(\displaystyle -\ln[S(t)]\)` | `\(\displaystyle \int_0^t h(u)du\)` | |

---
class: inverse, left, bottom, hide_logo

# Probability Plotting

---

## Overview of Probability Plotting

Probability plotting is a graphical method of fitting data to a chosen univariate probability model

The CDF of the chosen model is "linearized" or rearranged to the form of the equation of a line

$$ `\begin{aligned} G(\widehat{F}) &= m\cdot g(t) +b\\\\ y &= m\cdot x + b \end{aligned}` $$

where:

- `\(\widehat{F}\)` is an estimate of the failure probability (unreliability)
- `\(t\)` are the times at which an event was observed (failure, suspension)
- `\(G(\cdot)\)` and `\(g(\cdot)\)` are transformations, specific to the chosen distribution

If the plotted points fall "roughly" on a straight line, the distribution provides an adequate fit to the data <span class="explain"></span><span class="tooltip">The word "adequate" means a probability plot on its own is not sufficient to conclude that the chosen distribution provides the best description of the underlying process that generated the data<br/><br/>There may be many models for which the plotted points fall "roughly" on a straight line - in this situation numerical methods (such as maximum likelihood) are used to 
determine which model provides the best fit<br/><br/>Probability plots are helpful for quickly <u>rejecting</u> models that provide a poor fit to the data as the plotted points will not fall "roughly" on a straight line</span> The parameter values (i.e. slope and intercept) of the best fit line can be determined by graphical estimation, least squares optimization, or maximum likelihood estimation --- ## How to linearize the Weibull CDF .panelset[ .panel[.panel-name[Step 1] $$ `\begin{aligned} \redbf{F(t|\beta, \eta)} &\redbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\redbf{\text{Start with CDF for a Weibull Distribution}}& \Longleftarrow\\[10pt] \mbf{\widehat{F}} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Substitute nonparametric estimate}\;\widehat{F} \text{ for } F(t|\beta, \eta)}&\\[10pt] \mbf{1-\widehat{F}} &\mbf{= \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}& \mbf{\text{Move}\;1\;\text{and negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[1-\widehat{F}\bigg]}&\mbf{=-\bigg(\frac{t}{\eta}\bigg)^{\beta}}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\\[10pt] \mbf{\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]}&\mbf{=\bigg(\frac{t}{\eta}\bigg)^{\beta}}& \mbf{\text{Move negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]} &\mbf{=\beta\ln[t] - \beta\ln[\eta]}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}& \end{aligned}` $$ ] .panel[.panel-name[Step 2] $$ `\begin{aligned} \mbf{F(t|\beta, \eta)} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Start with CDF for a Weibull Distribution}}&\\[10pt] \redbf{\widehat{F}} &\redbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\redbf{\text{Substitute nonparametric estimate}\;\widehat{F} \text{ for } F(t|\beta, \eta)}& \Longleftarrow\\[10pt] \mbf{1-\widehat{F}} &\mbf{= \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}& \mbf{\text{Move}\;1\;\text{and negative 
sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[1-\widehat{F}\bigg]}&\mbf{=-\bigg(\frac{t}{\eta}\bigg)^{\beta}}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\\[10pt] \mbf{\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]}&\mbf{=\bigg(\frac{t}{\eta}\bigg)^{\beta}}& \mbf{\text{Move negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]} &\mbf{=\beta\ln[t] - \beta\ln[\eta]}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}& \end{aligned}` $$ ] .panel[.panel-name[Step 3] $$ `\begin{aligned} \mbf{F(t|\beta, \eta)} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Start with CDF for a Weibull Distribution}}&\\[10pt] \mbf{\widehat{F}} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Substitute nonparametric estimate}\;\widehat{F} \text{ for } F(t|\beta, \eta)}&\\[10pt] \redbf{1-\widehat{F}} &\redbf{= \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}& \redbf{\text{Move}\;1\;\text{and negative sign}\;(-)\;\text{over}}& \Longleftarrow\\[10pt] \mbf{\ln\bigg[1-\widehat{F}\bigg]}&\mbf{=-\bigg(\frac{t}{\eta}\bigg)^{\beta}}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\\[10pt] \mbf{\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]}&\mbf{=\bigg(\frac{t}{\eta}\bigg)^{\beta}}& \mbf{\text{Move negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]} &\mbf{=\beta\ln[t] - \beta\ln[\eta]}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}& \end{aligned}` $$ ] .panel[.panel-name[Step 4] $$ `\begin{aligned} \mbf{F(t|\beta, \eta)} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Start with CDF for a Weibull Distribution}}&\\[10pt] \mbf{\widehat{F}} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Substitute nonparametric estimate}\;\widehat{F} \text{ for } F(t|\beta, \eta)}&\\[10pt] \mbf{1-\widehat{F}} &\mbf{= \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}& \mbf{\text{Move}\;1\;\text{and 
negative sign}\;(-)\;\text{over}}&\\[10pt] \redbf{\ln\bigg[1-\widehat{F}\bigg]}&\redbf{=-\bigg(\frac{t}{\eta}\bigg)^{\beta}}&\redbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}& \Longleftarrow\\[10pt] \mbf{\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]}&\mbf{=\bigg(\frac{t}{\eta}\bigg)^{\beta}}& \mbf{\text{Move negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]}& \mbf{=\beta\ln[t] - \beta\ln[\eta]}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}& \end{aligned}` $$ ] .panel[.panel-name[Step 5] $$ `\begin{aligned} \mbf{F(t|\beta, \eta)} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Start with CDF for a Weibull Distribution}}&\\[10pt] \mbf{\widehat{F}} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Substitute nonparametric estimate}\;\widehat{F} \text{ for } F(t|\beta, \eta)}&\\[10pt] \mbf{1-\widehat{F}} &\mbf{= \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}& \mbf{\text{Move}\;1\;\text{and negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[1-\widehat{F}\bigg]}&\mbf{=-\bigg(\frac{t}{\eta}\bigg)^{\beta}}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\\[10pt] \redbf{\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]}&\redbf{=\bigg(\frac{t}{\eta}\bigg)^{\beta}}& \redbf{\text{Move negative sign}\;(-)\;\text{over}}& \Longleftarrow\\[10pt] \mbf{\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]} &\mbf{=\beta\ln[t] - \beta\ln[\eta]}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}& \end{aligned}` $$ ] .panel[.panel-name[Step 6] $$ `\begin{aligned} \mbf{F(t|\beta, \eta)} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Start with CDF for a Weibull Distribution}}&\\[10pt] \mbf{\widehat{F}} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Substitute nonparametric estimate}\;\widehat{F} \text{ for } F(t|\beta, \eta)}&\\[10pt] \mbf{1-\widehat{F}} &\mbf{= \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}& 
\mbf{\text{Move}\;1\;\text{and negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[1-\widehat{F}\bigg]}&\mbf{=-\bigg(\frac{t}{\eta}\bigg)^{\beta}}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\\[10pt] \mbf{\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]}&\mbf{=\bigg(\frac{t}{\eta}\bigg)^{\beta}}& \mbf{\text{Move negative sign}\;(-)\;\text{over}}&\\[10pt] \redbf{\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]} &\redbf{=\beta\ln[t] - \beta\ln[\eta]}&\redbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\Longleftarrow \end{aligned}` $$ ] .panel[.panel-name[Step 7] $$ `\begin{aligned} \mbf{F(t|\beta, \eta)} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Start with CDF for a Weibull Distribution}}&\\[10pt] \mbf{\widehat{F}} &\mbf{= 1 - \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}&\mbf{\text{Substitute nonparametric estimate}\;\widehat{F} \text{ for } F(t|\beta, \eta)}&\\[10pt] \mbf{1-\widehat{F}} &\mbf{= \exp\bigg[-\bigg(\frac{t}{\eta}\bigg)^{\beta}\bigg]}& \mbf{\text{Move}\;1\;\text{and negative sign}\;(-)\;\text{over}}&\\[10pt] \mbf{\ln\bigg[1-\widehat{F}\bigg]}&\mbf{=-\bigg(\frac{t}{\eta}\bigg)^{\beta}}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\\[10pt] \mbf{\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]}&\mbf{=\bigg(\frac{t}{\eta}\bigg)^{\beta}}& \mbf{\text{Move negative sign}\;(-)\;\text{over}}&\\[10pt] \bluebf{\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]} &\mbf{=}\;\greenbf{\beta} \redbf{\ln[t]} - \purplebf{\beta\ln[\eta]}&\mbf{\text{Take}\;\ln[\cdot]\;\text{of both sides}}&\\[10pt] \bluebf{y}\hspace{30pt}&\mbf{=}\;\greenbf{m}\cdot \redbf{x} \hspace{5pt}+ \hspace{5pt}\purplebf{b}&\text{Equation of a line in slope/intercept form}& \Longleftarrow \end{aligned}` $$ ] ] --- class: inverse, left, bottom, hide_logo # Elements of Weibull Plots --- ## Elements of Weibull Plots .pull-left[ <img src="data:image/png;base64,#C:/Users/Aubur/github/auburngrads/weibull_analysis/images/build_weibull/points.png" width="800" 
style="display: block; margin: auto;" /> ] .pull-right[ <br/> <red>Plotted Points representing the "observed unreliabilities"</red> - <red>X-axis:</red> `\(\redbf{t_i, i = 1,2,\ldots,N}\)`<br/> <red>ordered event times</red> - <red>Y-axis:</red> `\(\redbf{\hat{F}(t_i), i = 1,2,\ldots,N}\)`<br/> <red>non-parametric estimate of the Weibull CDF</red> ] --- ## Elements of Weibull Plots .pull-left[ <img src="data:image/png;base64,#C:/Users/Aubur/github/auburngrads/weibull_analysis/images/build_weibull/axes.png" width="800" style="display: block; margin: auto;" /> ] .pull-right[ <br/> Plotted Points representing the "observed unreliabilities" - X-axis: `\(t_i, i = 1,2,\ldots,N\)`<br/> ordered event times - Y-axis: `\(\hat{F}(t_i), i = 1,2,\ldots,N\)`<br/> non-parametric estimate of the Weibull CDF <red>Axes: transformed according to a <u>linearized Weibull CDF</u></red> ] --- ## Elements of Weibull Plots .pull-left[ <img src="data:image/png;base64,#C:/Users/Aubur/github/auburngrads/weibull_analysis/images/build_weibull/mle.png" width="800" style="display: block; margin: auto;" /> ] .pull-right[ <br/> Plotted Points representing the "observed unreliabilities" - X-axis: `\(t_i, i = 1,2,\ldots,N\)`<br/> ordered event times - Y-axis: `\(\hat{F}(t_i), i = 1,2,\ldots,N\)`<br/> non-parametric estimate of the Weibull CDF Axes: transformed according to a linearized Weibull CDF <red>Best fit line representing the predicted values from the Weibull model</red> - <red>Estimated using maximum likelihood</red> `\(\redbf{\widehat{\beta_{_{MLE}}},\widehat{\eta_{_{MLE}}}}\)` <br/> - <red>Estimated using ordinary least squares</red> `\(\redbf{\widehat{\beta_{_{OLS}}},\widehat{\eta_{_{OLS}}}}\)` ] --- ## Elements of Weibull Plots .pull-left[ <img src="data:image/png;base64,#C:/Users/Aubur/github/auburngrads/weibull_analysis/images/build_weibull/ci.png" width="800" style="display: block; margin: auto;" /> ] .pull-right[ <br/> Plotted Points representing the "observed unreliabilities" - 
X-axis: `\(t_i, i = 1,2,\ldots,N\)`<br/> ordered event times - Y-axis: `\(\hat{F}(t_i), i = 1,2,\ldots,N\)`<br/> non-parametric estimate of the Weibull CDF Axes: transformed according to a linearized Weibull CDF Best fit line representing the predicted values from the Weibull model - Estimated using maximum likelihood `\(\widehat{\beta_{_{MLE}}},\widehat{\eta_{_{MLE}}}\)` <br/> - Estimated using ordinary least squares `\(\widehat{\beta_{_{OLS}}},\widehat{\eta_{_{OLS}}}\)` <red>Upper/lower</red> `\(\redbf{100(1-\alpha)\%}\)` <red>confidence intervals</red> ] --- ## Methods for generating Weibull plot axes .panelset[ .panel[.panel-name[Linear (true) axes] .pull-left[ <img src="data:image/png;base64,#C:/Users/Aubur/github/auburngrads/weibull_analysis/images/build_weibull/both_axes_2-crop.png" width="800" style="display: block; margin: auto;" /> ] .pull-right[ - In this method the nonparametric estimate of the CDF and the event times are transformed according to the linearized Weibull CDF $$ `\begin{aligned} G(\widehat{F}) &=\ln\bigg[\ln\bigg[\frac{1}{1-\widehat{F}}\bigg]\bigg]\\\\ g(t) &= \ln[t] \end{aligned}` $$ - These values are plotted on linear axes - The advantage is that values plotted on these axes can be used to graphically estimate `\(\eta\)` and `\(\beta\)` - The disadvantage is that the viewer may not understand what the values along the y-axis mean ] ] .panel[.panel-name[Transformed (not so true) axes] .pull-left[ <img src="data:image/png;base64,#C:/Users/Aubur/github/auburngrads/weibull_analysis/images/build_weibull/both_axes_1-crop.png" width="800" style="display: block; margin: auto;" /> ] .pull-right[ This method has been popularized through the use of <a target="_blank" href="https://www.weibull.com/GPaper/">**Weibull Plotting papers**</a> Note the values on the second y-axis (right side) - these values correspond to failure probabilities (left side) that have been transformed according to the linearized Weibull CDF <span class="explain"></span><span 
class="tooltip">Examples:<br/> $$ `\begin{aligned} 1.933 &\approx \ln\bigg[\ln\bigg[\frac{1}{1-0.999}\bigg]\bigg]\\\\ -5.296 &\approx \ln\bigg[\ln\bigg[\frac{1}{1-0.005}\bigg]\bigg] \end{aligned}` $$ </span>

This method overwrites the true y-axis values with their corresponding probabilities <span class="explain"></span><span class="tooltip">For example, 1.933 is overwritten with 0.999<br/><br/>As a result, the y-axis on Weibull plots appears to be logarithmic - but it's really a linear axis with some clever hand waving
</span>

- The advantage of this method is that the raw failure times and nonparametric estimates can be plotted directly and the resulting plot is more easily interpretable
- The disadvantage of this method is that the plotted points cannot be used to graphically estimate the value of the shape parameter `\(\beta\)`

]
]
]

---

## Generating the plotted points

.panelset[
.panel[.panel-name[Overview]

Regardless of how the axes are drawn, the plotted points are computed in a similar manner

- X-axis: observed event times (failure, suspension, or other event)
- Y-axis: nonparametric estimate of the failure probability (CDF) at the observed event times

The nonparametric estimate of the CDF `\(\widehat{F}\)` plays a key role in probability plotting

- One cannot observe the reliability or unreliability of an item - events are the only observable source of information
- Reliability or unreliability values must be estimated from the data
- The following panels discuss some considerations for computing the values of the plotted points

]

.panel[.panel-name[X-coordinates]

.pull-left[

The x-coordinate of each plotted point is determined by how the observed event is categorized

- Exact failure: "exact" failure time observed `\(t_f = t\)`
- Left censored: failure is discovered at first inspection - exact failure time not known `\(t_f \in (0,t_{1})\)`
- Interval censored: failure occurs between inspections and is discovered at inspection `\(i=2,\cdots,n\)` - exact failure time not known `\(t_f \in (t_{i-1},t_{i})\)`
- Right censored: failure not observed at final `\(n^{th}\)` inspection - exact failure time not known `\(t_f \in (t_{n},\infty)\)`

Data sets that include only exact failures are called complete data sets

]

.pull-right[

<img src="data:image/png;base64,#images/censored2.png" width="989" style="display: block; margin: auto;" />

]
]

.panel[.panel-name[Y-coordinates]

The y-coordinate of each point is a nonparametric estimate of the CDF corresponding to the observed 
event times Many nonparametric estimators have been developed, the choice of which to use is driven by the type of censoring (suspensions) present in the data - Estimators for complete data sets + Median-Ranks plotting position + Hazen plotting position + Weibull plotting position - Estimators for right censored data sets + Kaplan-Meier estimator (aka the product limit estimator) + Modified median-ranks estimator - Estimators for data with generalized censoring + Turnbull's Estimator + Generalized Kaplan-Meier estimator ] .panel[.panel-name[Plotting Positions] Probability plotting positions express the non-exceedance probability of the CDF for the `\(i^{th}\)` ascending data value The generic plotting position formula is expressed as $$ \widehat{F(t_{i})}=\frac{i-a}{n+1-2a} $$ - where + `\(i\)` is an index of the ordered observations (smallest `\(\rightarrow\)` largest) + `\(n\)` is the number of observations + `\(a\)` is the plotting position parameter <span class="explain"></span><span class="tooltip">The value of `\(a\)` is chosen to produce approximately unbiased estimates of `\(F(t_{i})\)` for an assumed distribution</span> Various formulae have been developed to correspond with specific distributions <span class="explain"></span><span class="tooltip"> <br/> The formulae used most often in practice are: <br/> $$ `\begin{aligned} \text{Hazen }(a = 0.5):\; &\widehat{F(t_{i})}=\frac{i-0.5}{n}\\\\ \text{Chegodayev }(a = 0.3):\; &\widehat{F(t_{i})}=\frac{i-0.3}{n+0.4}\\\\ \text{Weibull }(a = 0):\; &\widehat{F(t_{i})}=\frac{i}{n+1}\\\\ \end{aligned}` $$ Note: the Chegodayev plotting position is also known as "Median-Ranks" </span> ] ] --- class: inverse, left, bottom, hide_logo # Fitting models to data --- ## Fitting models to data - Overview .pull-left[ Once the data are plotted, grade school algebra <span class="explain"></span><span class="tooltip">The Weibull shape parameter is represented by the line slope, thus, `\(\widehat{\beta} = \text{rise} / \text{run}\; 
=(y_2 - y_1)/(x_2 - x_1)\)`<br/>As noted on a previous slide the intercept of the best fit line is expressed as `\(b = \beta\log(\eta)\)`, thus the Weibull scale parameter is `\(\widehat{\eta} = \exp(\text{intercept / }\widehat{\beta})\)`</span> can be used to graphically estimate the parameters of the model that best fits the data

In most cases the parameters are estimated numerically via least squares optimization or maximum likelihood estimation

The best-fit distribution is then overlaid on the Weibull plot to illustrate how well it fits the data

Strictly speaking, a Weibull plot isn't needed - but it's a nice way to illustrate the result of an analysis

The plot to the right shows the best-fit Weibull distribution as determined by least squares optimization

- The vertical lines show the distance between the plotted points and the model
- Least squares estimates the best-fit line by minimizing the sum of these squared distances

]

.pull-right[

<img src="data:image/png;base64,#index_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" />

]

---

## Maximum Likelihood Estimation

.panelset[
.panel[.panel-name[Intro]

ML estimation is a versatile method for fitting statistical models to data

- Can be applied to a wide variety of statistical models and data structures
- Provides a numerical procedure to discern which model best fits a data set (among a set of selected models)

ML estimation produces efficient and consistent estimators under certain regularity conditions<span class="explain"></span><span class="tooltip"><b><u>Maximum Likelihood Regularity Conditions</u></b><br/><br/> The support region for a selected model does not depend on <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>θ<!-- θ --></mi></math> <br/><br/> The parameters are identifiable (i.e., for <math xmlns="http://www.w3.org/1998/Math/MathML"><msub><mi>θ<!-- θ --></mi><mn>1</mn></msub><mo>≠<!-- ≠ --></mo><msub><mi>θ<!-- θ 
--></mi><mn>2</mn></msub><mo>,</mo><mspace width="thickmathspace"/><mi>f</mi><mo stretchy="false">(</mo><mi>t</mi><mrow class="MJX-TeXAtom-ORD"><mo stretchy="false">|</mo> </mrow><msub><mi>θ<!-- θ --></mi><mn>1</mn></msub><mo stretchy="false">)</mo> <mo>≠<!-- ≠ --></mo><mi>f</mi><mo stretchy="false">(</mo><mi>t</mi><mrow class="MJX-TeXAtom-ORD"><mo stretchy="false">|</mo></mrow><msub><mi>θ<!-- θ --></mi><mn>2</mn></msub><mo stretchy="false">)</mo><mo>,</mo><mspace width="thickmathspace"/><mi mathvariant="normal">∀<!-- ∀ --></mi><mi>t</mi></math> ) <br/><br/><math xmlns="http://www.w3.org/1998/Math/MathML"><mi>f</mi><mo stretchy="false">(</mo><mi>t</mi><mrow class="MJX-TeXAtom-ORD"><mo stretchy="false">|</mo></mrow><munder><mi>θ<!-- θ --></mi><mo>_<!-- _ --></mo></munder><mo stretchy="false">)</mo></math> has a <math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mn>3</mn><mrow class="MJX-TeXAtom-ORD"><mi>r</mi><mi>d</mi></mrow></msup></math> mixed partial derivative <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>E</mi><mrow><mo>[</mo><mfrac><mrow><msup><mi mathvariant="normal">∂<!-- ∂ --></mi><mrow class="MJX-TeXAtom-ORD"><mn>2</mn></mrow></msup><mi>log</mi><mo>⁡<!-- --></mo><mo stretchy="false">(</mo><mi>f</mi><mo stretchy="false">(</mo><mi>t</mi><mrow class="MJX-TeXAtom-ORD"><mo stretchy="false">|</mo></mrow><mi>θ<!-- θ --></mi><mo stretchy="false">)</mo><mo stretchy="false">)</mo></mrow><mrow><mi mathvariant="normal">∂<!-- ∂ --></mi><mi>θ<!-- θ --></mi><mo stretchy="false">(</mo><mi mathvariant="normal">∂<!-- ∂ --></mi><mi>θ<!-- θ --></mi><msup><mo stretchy="false">)</mo><mi>T</mi></msup></mrow></mfrac><mo>]</mo></mrow><mo>=</mo><mfrac><mrow><msup><mi mathvariant="normal">∂<!-- ∂ --></mi><mn>2</mn></msup><mi>E</mi><mrow><mo>[</mo> <mrow><mi>log</mi><mo>⁡<!-- --></mo><mo stretchy="false">(</mo><mi>f</mi><mo stretchy="false">(</mo><mi>t</mi><mrow class="MJX-TeXAtom-ORD"><mo stretchy="false">|</mo></mrow><mi>θ<!-- θ --></mi><mo stretchy="false">)</mo><mo 
stretchy="false">)</mo></mrow><mo>]</mo></mrow></mrow><mrow><mi mathvariant="normal">∂<!-- ∂ --></mi><mi>θ<!-- θ --></mi><mo stretchy="false">(</mo><mi mathvariant="normal">∂<!-- ∂ --></mi><mi>θ<!-- θ --></mi><msup><mo stretchy="false">)</mo><mi>T</mi></msup></mrow></mfrac></math> <br/><br/>Elements of the Fisher information matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><msub><mrow class="MJX-TeXAtom-ORD"><mi mathvariant="script">I</mi></mrow><mrow class="MJX-TeXAtom-ORD"><mi>θ<!-- θ --></mi></mrow></msub></math> are finite <br/><br/><math xmlns="http://www.w3.org/1998/Math/MathML"><msub><mrow class="MJX-TeXAtom-ORD"><mi mathvariant="script">I</mi></mrow><mrow class="MJX-TeXAtom-ORD"><mi>θ<!-- θ --></mi></mrow></msub></math> is a positive-definite matrix <br/><br/>The value of <math xmlns="http://www.w3.org/1998/Math/MathML"><msub><mi>θ<!-- θ --></mi><mrow class="MJX-TeXAtom-ORD"><mi>M</mi><mi>L</mi><mi>E</mi></mrow></msub></math> is on the interior of the parameter space </span>

- Efficient - Estimates `\(\mathbf{\underline{\theta}}=\theta_{1},\theta_{2},...\)` in the "best manner": asymptotically, the variance of the estimator attains the Cramér-Rao lower bound
- Consistent - `\(\hat{\theta}_{_{MLE}}\xrightarrow{p}\theta\;\;\text{as}\;n\rightarrow\infty\)`

]

.panel[.panel-name[Properties]

The likelihood is equal to the joint probability (density) of the data, viewed as a function of the parameters

`$$\mathscr{L}(\underline{\theta}|\underline{x})=f(\underline{x}|\underline{\theta})=\prod_{i=1}^{n}\mathscr{L}_{i}(\underline{\theta}|x_i)=\prod_{i=1}^{n}f(x_{i}|\underline{\theta}),\;\;\text{if}\;x_{i}\; iid$$`

Properties of the Likelihood Function `\(\mathscr{L}\)`

1. `\(\mathscr{L}(\theta|\underline{x})\ge 0\)`
2. `\(\mathscr{L}(\theta|\underline{x})\)` is not a pdf in `\(\theta\)`, i.e., in general `\(\int \mathscr{L}(\theta|\underline{x})\;d\theta \ne 1\)`
3. Suggests (relatively) which values of `\(\theta\)` are more likely to have generated the observed data `\(\underline{x}\)` (assuming the chosen parametric model is correct)
4. 
If it exists, the value of `\(\underline{\theta}\)` that maximizes `\(\mathscr{L}(\underline{\theta}|\underline{x})\)` is called the maximum likelihood estimator (denoted `\(\hat{\theta}_{_{MLE}}\)`)
5. We often find `\(\hat{\theta}_{_{MLE}}\)` by maximizing the log-likelihood function `\(\mathcal{L}(\underline{\theta}|\underline{x})=\log\Big(\mathscr{L}(\underline{\theta}|\underline{x})\Big)\)`, which has the same maximizer

]

.panel[.panel-name[Definition]

Both `\(f(\underline{x}|\underline{\theta})\)` and `\(\mathscr{L}(\underline{\theta}|\underline{x})\)` start with a distributional assumption

.pull-left[

`\(f(\underline{x}|\underline{\theta})\)`

- Returns the probability (density) of observing data `\(\underline{x}=x_1,...,x_n, \;\;n\in(1,\infty)\)` from a specified distribution `\(f(x|\theta)\)` <span class="explain"></span><span class="tooltip"><b><u>This statement comes with two assumptions</u></b><br/><br/>1. We know (or at least have specified) a functional form for `\(f\)`<br/><br/>2. We know (or at least have specified) values for `\(\theta\)`</span>
- Is a function of `\(\underline{x}\)` assuming `\(\underline{\theta}=\theta_{1},\theta_{2},...\)` are <focus>known</focus>
- What data `\(\underline{x}\)` are most likely to be produced by a distribution with parameters `\(\underline{\theta}\)`?
- `\(f(x_i|\underline{\theta})\)` is the probability density associated with observation `\(x_i\)`

]

.pull-right[

`\(\mathscr{L}(\underline{\theta}|\underline{x})\)`

- Returns the likelihood that `\(\underline{\theta}\)` are the parameters that produced `\(\underline{x}=x_1,...,x_n, \;\;n\in(1,\infty)\)` from a specified distribution of the form `\(f(x|\theta)\)`
- Is a function of `\(\underline{\theta}\)` assuming `\(\underline{x}=x_{1},...,x_{n}\)` has already been observed
- What values of `\(\underline{\theta}\)` are most likely to have produced `\(\underline{x}\)`?
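
The two views can be contrasted in a short Python sketch. It assumes a two-parameter Weibull model; the function names `weibull_pdf` and `log_likelihood`, the failure times, and the candidate parameter pairs are all hypothetical and chosen only for illustration.

```python
import math

def weibull_pdf(t, beta, eta):
    """Weibull density f(t | beta, eta) for t > 0."""
    z = t / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))

def log_likelihood(beta, eta, data):
    """Log-likelihood of (beta, eta) for exact, iid observations."""
    return sum(math.log(weibull_pdf(t, beta, eta)) for t in data)

# Hypothetical failure times, for illustration only
data = [12.0, 25.0, 31.0, 47.0, 58.0]

# Density view: the parameters are fixed and the data value varies
density_at_25 = weibull_pdf(25.0, beta=2.0, eta=40.0)

# Likelihood view: the data are fixed and the parameters vary
candidates = [(1.0, 40.0), (2.0, 40.0), (2.0, 80.0)]
scores = {theta: log_likelihood(*theta, data) for theta in candidates}
best = max(scores, key=scores.get)
print("Candidate with the highest log-likelihood:", best)
```

Comparing a handful of candidates only ranks them relative to each other; finding the maximizer itself would require a numerical optimizer over `\(\beta\)` and `\(\eta\)`.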
]
]

.panel[.panel-name[Definitions 2]

The Likelihood Function and Its Maximum

- The value of the likelihood function `\(\mathscr{L}(\underline{\theta}|\underline{t})\)` depends on
  1. The assumed parametric model
  2. The observed data
- The total likelihood is composed of the contributions from every observation
  + For observations `\(t_i, i = 1,\cdots,n\)`, the model with the highest joint probability is the model that is most likely to have generated the observations
  + For a single observation, the model providing the greatest contribution to the total likelihood may not be the correct model
  + As the number of observations increases, more information is obtained and it becomes easier to differentiate which model best fits the data and best describes the underlying failure process

]

.panel[.panel-name[Reliability Data]

Likelihood Contributions For Reliability Data

- For failure data, each observation makes one of four contributions to the likelihood function

`$$\mathscr{L}_{i}(\underline{\theta}|t_{i})=\begin{cases} S(t_{i}) &\mbox{for a right censored observation}\\F(t_{i}) &\mbox{for a left censored observation}\\F(t_{i})-F(t_{i-1}) &\mbox{for an interval censored observation}\\\lim\limits_{\Delta_i\rightarrow 0} \frac{F(t_{i})-F(t_{i}-\Delta_{i})}{\Delta_{i}}=f(t_{i}) &\mbox{for an "exact" observation}\end{cases}$$`

- Thus, the total likelihood function may be expressed as

`$$\mathscr{L}(\underline{\theta}|\underline{t})=C\prod_{i=1}^{n} \mathscr{L}_{i}(\underline{\theta}|t_i) =C\prod_{i=1}^{m+1}\Big(F(t_{i})\Big)^{l_{i}}\Big(F(t_{i})-F(t_{i-1})\Big)^{d_{i}}\Big(1-F(t_{i})\Big)^{r_{i}}$$`

- where
  + `\(l_i=1\)` if `\(t_i\)` is a left censored observation (0 otherwise)
  + `\(d_i=1\)` if `\(t_i\)` is an interval censored observation (0 otherwise)
  + `\(r_i=1\)` if `\(t_i\)` is a right censored observation (0 otherwise)
  + `\(n = \sum_{j=1}^{m+1}(l_{j}+d_{j}+r_{j})\)`

]
]

---

## Example: Complete Data Set

.panelset[
.panel[.panel-name[Overview]

.pull-left[

.bold[Motivation]

- 
Fatigue is known to affect the service life of deep-groove ball bearings
- Higher stresses further reduce the fatigue life of the ball bearings
- A failure time regression model was developed to describe the effect that higher levels of stress have on the fatigue life of these bearings
- However, there was disagreement within the industry on whether the estimated parameter values currently being used were accurate
- Additional tests were needed to reduce the uncertainty about the parameter values used to model fatigue life as a function of applied stress

]

.pull-right[

.bold[Study Objectives]

- Test `\(n = 23\)` ball bearings at a key stress level to augment the existing failure data
- The additional observations should result in better estimates of the parameter values

.bold[Analysis]

- The data are the number of accumulated fatigue cycles (in millions) when each failure was observed
- The stress applied to each bearing in this test is unknown - must assume that the stress applied to all 23 bearings was equivalent

]
]

.panel[.panel-name[Data]
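
For a complete data set like this one, the plotting positions and least squares fit described earlier can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, the `cycles` values are made-up placeholders (not the 23 observed bearing lives), and the fitted line is written as `\(y = \beta x + c\)` with `\(c = -\beta\log(\eta)\)`, so that `\(\widehat{\eta} = \exp(-c/\widehat{\beta})\)`.

```python
import math

def plotting_positions(n, a=0.3):
    """Generic plotting positions (i - a) / (n + 1 - 2a) for i = 1..n.

    a = 0.3 gives median ranks, a = 0.5 Hazen, a = 0 Weibull.
    """
    return [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]

def weibull_least_squares(times, a=0.3):
    """Least squares line through the Weibull-plot points (complete data only).

    Regresses y = log(-log(1 - F)) on x = log(t). The slope estimates beta;
    with the line written as y = beta * x + c, eta = exp(-c / beta).
    """
    ordered = sorted(times)
    probs = plotting_positions(len(ordered), a)
    x = [math.log(t) for t in ordered]
    y = [math.log(-math.log(1.0 - p)) for p in probs]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    c = y_bar - beta_hat * x_bar
    eta_hat = math.exp(-c / beta_hat)
    return beta_hat, eta_hat

# Placeholder failure times (millions of cycles), for illustration only
cycles = [22.0, 35.5, 41.0, 57.3, 68.9, 84.0, 102.4, 131.7]
beta_hat, eta_hat = weibull_least_squares(cycles)
print(f"beta = {beta_hat:.2f}, eta = {eta_hat:.1f}")
```

Regressing the transformed probabilities on log time is one common choice; regressing log time on the transformed probabilities instead generally gives slightly different estimates.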
]

.panel[.panel-name[Event Plots]

.pull-left[

<div class="figure" style="text-align: center"> <img src="data:image/png;base64,#index_files/figure-html/unnamed-chunk-28-1.png" alt="Event plot of the ball bearing failure data" /> <p class="caption">Event plot of the ball bearing failure data</p> </div>

]

.pull-right[

<div class="figure" style="text-align: center"> <img src="data:image/png;base64,#index_files/figure-html/unnamed-chunk-29-1.png" alt="Histogram of the ball bearing failure data" /> <p class="caption">Histogram of the ball bearing failure data</p> </div>

]
]

.panel[.panel-name[Likelihood]

.pull-left[

The likelihood function for this data set is expressed as

$$ `\begin{aligned} \mathscr{L}(\eta,\beta|\underline{t}) &= \prod_{i=1}^{23}\frac{\beta}{\eta}\bigg(\frac{t_{i}}{\eta}\bigg)^{\beta-1}\exp\bigg[-\bigg(\frac{t_i}{\eta}\bigg)^{\beta}\bigg]\\\\ &=\frac{\beta}{\eta}\bigg(\frac{t_{1}}{\eta}\bigg)^{\beta-1}\exp\bigg[-\bigg(\frac{t_1}{\eta}\bigg)^{\beta}\bigg]\times\cdots\times\frac{\beta}{\eta}\bigg(\frac{t_{23}}{\eta}\bigg)^{\beta-1}\exp\bigg[-\bigg(\frac{t_{23}}{\eta}\bigg)^{\beta}\bigg] \end{aligned}` $$

The objective is to find the values of `\(\beta\)` and `\(\eta\)` that maximize this function

The surface to the right shows the maximum value and this tooltip shows R code used to numerically find the optimal values

]
]

.panel[.panel-name[Weibull Plot]

]
]

---

## Example data

.panelset[
.panel[.panel-name[Overview]

This is some text

]

.panel[.panel-name[Data]

]

.panel[.panel-name[Likelihood Function]

]

.panel[.panel-name[Likelihood Surface]

]
]

---

## References

Weibull, W. (1951). "A Statistical Distribution Function of Wide Applicability". In: _Journal of Applied Mechanics_ 18.3, pp. 293-297. DOI: 10.1115/1.4010337. <URL: https://doi.org/10.1115/1.4010337>.